An Experiment in Integrating Internet Information Sources
Abstract
The number of online information sources is growing rapidly. Though much of this information is unstructured (e.g., text, images), the number of structured information sources (e.g., databases) is also increasing. The advantage of structured information sources is that we can query their contents and use them to answer queries, perhaps in combination with other structured sources, rather than just viewing the whole information source. Given the variability in the information sources available, it is impractical for a user to interact with each source using its specific terminology. As a consequence, several systems (such as Nomenclator [Ordille and Miller, 1993b; Ordille, 1994], The Information Manifold [Kirk et al., 1995], TSIMMIS [Chawathe et al., 1994], and SIMS [Arens et al., 1994]) have been developed based on the notion of a mediator [Wiederhold, 1992]. The key idea behind a mediator is that a user interacts with one uniform model of the domain, and the mediator translates between queries posed in the domain model and the ontologies of the specific information sources. To do so, the mediator needs descriptions of the specific information sources that relate their contents to the domain model seen by the user. A major open issue, crucial to the success of applying the notion of a mediator to large numbers of networked information sources, is the ability to obtain accurate descriptions of the sources. The difficulty lies in problems such as sources using different attribute names to refer to the same intended attribute, or using the same attribute name to refer to different intended attributes. Further complications arise when the format of the attributes differs, even if they mean the same thing. An important characteristic of many networked information sources is that they fall into classes that model very similar domains.
For example, a large class of information sources describes data about people in an organization, such as their addresses, phone numbers, and email addresses. Similarly, many information sources contain bibliographic citations, or product and price listings. This paper describes an approach to the problem of obtaining descriptions of information sources that is well suited for networked information sources of this type, and the first large-scale experiment, which we are currently conducting, aimed at obtaining such descriptions in a real-world environment. Our approach is based on the assumption that obtaining descriptions of information sources is a task that will inherently involve the managers of the information sources themselves, and that the key to solving the problem is to find ways in which they can provide the most accurate description of their sources with minimal effort.
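The role of source descriptions in a mediator can be illustrated with a minimal Python sketch. It is not the system described in the paper: the class `SourceDescription`, the functions `translate_query` and `translate_record`, and all attribute names are hypothetical, and a description is assumed to be nothing more than a mapping from domain-model attributes to the attribute names a source uses.

```python
from dataclasses import dataclass


@dataclass
class SourceDescription:
    """Hypothetical source description: domain attribute -> source attribute."""
    name: str
    attribute_map: dict


def translate_query(query, source):
    """Rewrite a domain-model query using the source's own attribute names.

    Returns None when the source does not describe one of the queried
    attributes, i.e. it cannot answer the query on its own.
    """
    translated = {}
    for attribute, value in query.items():
        if attribute not in source.attribute_map:
            return None
        translated[source.attribute_map[attribute]] = value
    return translated


def translate_record(record, source):
    """Map a record returned by the source back into domain-model attributes.

    Source attributes that the description does not mention are dropped.
    """
    inverse = {src: dom for dom, src in source.attribute_map.items()}
    return {inverse[key]: value for key, value in record.items() if key in inverse}


# Two made-up "people in an organization" sources that name the same
# intended attributes differently.
hr_db = SourceDescription("hr_db", {"name": "full_name", "email": "mail", "phone": "tel"})
dept_dir = SourceDescription("dept_dir", {"name": "person", "email": "e_addr"})

if __name__ == "__main__":
    user_query = {"name": "Ada Lovelace"}
    for source in (hr_db, dept_dir):
        print(source.name, translate_query(user_query, source))
```

The user writes one query against the domain model, and each source receives it in its own vocabulary. The sketch covers only attribute renaming; differences in attribute format, which the abstract also mentions, would need per-attribute conversion functions in the description.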
Publication date: 1995